Youssef Skouri
Baha Khayati
Taysir Fattoumi
When we talk about “process analysis”, a natural question is whether there is an alternative to relying solely on workshops, interviews or outdated process documents. Given how heavily today's world relies on computing technology, it makes sense to look for a more effective and reliable way to analyse processes.
Process mining techniques extract information from event logs. For example, the audit trails of a workflow management system or the transaction logs of an enterprise resource planning system can be used to discover models describing processes, organizations, and products. Moreover, process mining can be used to monitor deviations (e.g., by comparing the observed events with predefined models or business rules in the context of the Sarbanes-Oxley Act, SOX).
library(bupaR)
## Warning: package 'bupaR' was built under R version 3.4.4
## Loading required package: edeaR
## Warning: package 'edeaR' was built under R version 3.4.4
## Loading required package: eventdataR
## Warning: package 'eventdataR' was built under R version 3.4.4
## Loading required package: processmapR
## Warning: package 'processmapR' was built under R version 3.4.4
## Loading required package: xesreadR
## Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
## logical.return = TRUE, : there is no package called 'xesreadR'
## Loading required package: processmonitR
## Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
## logical.return = TRUE, : there is no package called 'processmonitR'
## Loading required package: petrinetR
## Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
## logical.return = TRUE, : there is no package called 'petrinetR'
##
## Attaching package: 'bupaR'
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:utils':
##
## timestamp
library(edeaR)
library(processmapR)
library(eventdataR)
library(readr)
## Warning: package 'readr' was built under R version 3.4.4
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.4.4
## -- Attaching packages ---------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.0 v purrr 0.2.5
## v tibble 1.4.2 v dplyr 0.7.6
## v tidyr 0.8.1 v stringr 1.4.0
## v ggplot2 3.1.0 v forcats 0.3.0
## Warning: package 'ggplot2' was built under R version 3.4.4
## Warning: package 'tibble' was built under R version 3.4.4
## Warning: package 'tidyr' was built under R version 3.4.4
## Warning: package 'purrr' was built under R version 3.4.4
## Warning: package 'dplyr' was built under R version 3.4.4
## Warning: package 'stringr' was built under R version 3.4.4
## Warning: package 'forcats' was built under R version 3.4.4
## -- Conflicts ------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks bupaR::filter(), stats::filter()
## x dplyr::lag() masks stats::lag()
library(DiagrammeR)
## Warning: package 'DiagrammeR' was built under R version 3.4.4
library(ggplot2)
library(stringr)
library(lubridate)
## Warning: package 'lubridate' was built under R version 3.4.4
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
##
## date
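# Read the same file twice: once with base R (text columns become factors)
# and once with readr (tibble with guessed column types).
data = read.csv("credit_file +.csv", sep = ",", header = TRUE)
data_act <- readr::read_csv("credit_file +.csv",
locale = locale(date_names = 'en',
encoding = 'ISO-8859-1'))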
## Parsed with column specification:
## cols(
## .default = col_character(),
## Variant_index = col_double(),
## `(case)_RequestedAmount` = col_double(),
## Accepted = col_logical(),
## CreditScore = col_double(),
## FirstWithdrawalAmount = col_double(),
## MonthlyCost = col_double(),
## NumberOfTerms = col_double(),
## OfferedAmount = col_double(),
## Selected = col_logical(),
## starttimestamp = col_datetime(format = ""),
## endtimestamp = col_datetime(format = "")
## )
## See spec(...) for full column specifications.
## Warning: 36773 parsing failures.
## row col expected actual file
## 174 -- 24 columns 1 columns 'credit_file +.csv'
## 175 -- 24 columns 1 columns 'credit_file +.csv'
## 176 -- 24 columns 1 columns 'credit_file +.csv'
## 177 -- 24 columns 1 columns 'credit_file +.csv'
## 178 -- 24 columns 1 columns 'credit_file +.csv'
## ... ... .......... ......... ...................
## See problems(...) for more details.
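The parsing failures above report rows with 1 column where 24 were expected. One way to locate such rows before loading the data is to count the fields on each line. The sketch below uses a small hypothetical CSV written to a temporary file, not the credit file itself, and assumes the cause is a line wrapped in quotes, which is only one possible explanation for the warning.

```r
# Toy CSV (hypothetical data): line 3 is wrapped in double quotes,
# so a CSV parser sees it as a single field instead of three.
toy_csv <- c(
  "Case_ID,Activity,Resource",
  "Application_1,A_Create Application,User_1",
  "\"Application_2,A_Submitted,User_1\"",
  "Application_3,A_Concept,User_1"
)
tmp <- tempfile(fileext = ".csv")
writeLines(toy_csv, tmp)

# Number of comma-separated fields on each line, honouring double quotes.
n_fields <- count.fields(tmp, sep = ",", quote = "\"")

# Lines whose field count differs from the header line are suspect.
bad_lines <- which(n_fields != n_fields[1])
print(bad_lines)

unlink(tmp)
```

On the real file, the same check could be followed by inspecting the flagged lines with `readLines()` before deciding whether to repair or drop them.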
head(data)
## Case_ID Activity Resource
## 1 Application_652823628 A_Create Application User_1
## 2 Application_652823628 A_Submitted User_1
## 3 Application_652823628 A_Concept User_1
## 4 Application_652823628 W_Complete application User_17
## 5 Application_652823628 A_Accepted User_52
## 6 Application_652823628 O_Create Offer User_52
## Start_Timestamp Complete_Timestamp Variant Variant_index
## 1 2016/01/01 10:51:15.304 2016/01/01 10:51:15.304 Variant 2 2
## 2 2016/01/01 10:51:15.352 2016/01/01 10:51:15.352 Variant 2 2
## 3 2016/01/01 10:52:36.413 2016/01/01 10:52:36.413 Variant 2 2
## 4 2016/01/02 11:45:22.429 2016/01/02 11:45:22.429 Variant 2 2
## 5 2016/01/02 12:23:04.299 2016/01/02 12:23:04.299 Variant 2 2
## 6 2016/01/02 12:29:03.994 2016/01/02 12:29:03.994 Variant 2 2
## X.case._ApplicationType X.case._creditGoal X.case._RequestedAmount
## 1 New credit Existing credit takeover 20000
## 2 New credit Existing credit takeover 20000
## 3 New credit Existing credit takeover 20000
## 4 New credit Existing credit takeover 20000
## 5 New credit Existing credit takeover 20000
## 6 New credit Existing credit takeover 20000
## Accepted Action CreditScore EventID EventOrigin
## 1 <NA> Created NA Application_652823628 Application
## 2 <NA> statechange NA ApplState_1582051990 Application
## 3 <NA> statechange NA ApplState_642383566 Application
## 4 <NA> Obtained NA Workitem_1875340971 Workflow
## 5 <NA> statechange NA ApplState_99568828 Application
## 6 true Created 979 Offer_148581083 Offer
## FirstWithdrawalAmount MonthlyCost NumberOfTerms OfferID OfferedAmount
## 1 NA NA NA <NA> NA
## 2 NA NA NA <NA> NA
## 3 NA NA NA <NA> NA
## 4 NA NA NA <NA> NA
## 5 NA NA NA <NA> NA
## 6 20000 498.29 44 <NA> 20000
## Selected lifecycle.transition starttimestamp endtimestamp
## 1 <NA> complete 2016-01-01T09:51:15Z 2016-01-01T09:51:15Z
## 2 <NA> complete 2016-01-01T09:51:15Z 2016-01-01T09:51:15Z
## 3 <NA> complete 2016-01-01T09:52:36Z 2016-01-01T09:52:36Z
## 4 <NA> start 2016-01-02T10:45:22Z 2016-01-02T10:45:22Z
## 5 <NA> complete 2016-01-02T11:23:04Z 2016-01-02T11:23:04Z
## 6 true complete 2016-01-02T11:29:03Z 2016-01-02T11:29:03Z
head(data_act)
## # A tibble: 6 x 24
## Case_ID Activity Resource Start_Timestamp Complete_Timest~ Variant
## <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Applic~ A_Creat~ User_1 2016/01/01 10:~ 2016/01/01 10:5~ Varian~
## 2 Applic~ A_Submi~ User_1 2016/01/01 10:~ 2016/01/01 10:5~ Varian~
## 3 Applic~ A_Conce~ User_1 2016/01/01 10:~ 2016/01/01 10:5~ Varian~
## 4 Applic~ W_Compl~ User_17 2016/01/02 11:~ 2016/01/02 11:4~ Varian~
## 5 Applic~ A_Accep~ User_52 2016/01/02 12:~ 2016/01/02 12:2~ Varian~
## 6 Applic~ O_Creat~ User_52 2016/01/02 12:~ 2016/01/02 12:2~ Varian~
## # ... with 18 more variables: Variant_index <dbl>,
## # `(case)_ApplicationType` <chr>, `(case)_creditGoal` <chr>,
## # `(case)_RequestedAmount` <dbl>, Accepted <lgl>, Action <chr>,
## # CreditScore <dbl>, EventID <chr>, EventOrigin <chr>,
## # FirstWithdrawalAmount <dbl>, MonthlyCost <dbl>, NumberOfTerms <dbl>,
## # OfferID <chr>, OfferedAmount <dbl>, Selected <lgl>,
## # `lifecycle:transition` <chr>, starttimestamp <dttm>,
## # endtimestamp <dttm>
# Convert the timestamp columns from text to POSIXct.
# Note: "%S" reads whole seconds only, so the fractional part of the raw values is dropped.
data_act$starttimestamp = as.POSIXct(data_act$Start_Timestamp, format = "%Y/%m/%d %H:%M:%S")
data_act$endtimestamp = as.POSIXct(data_act$Complete_Timestamp,
format = "%Y/%m/%d %H:%M:%S")
data$Complete_Timestamp = as.POSIXct(data$Complete_Timestamp,
format = "%Y/%m/%d %H:%M:%S")
data$Activity_Instance_ID = seq_len(nrow(data)) # needed for eventlog(): one unique id per row
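The raw timestamps carry milliseconds (for example `2016/01/01 10:51:15.304`), which the `%S` format does not read. If sub-second ordering of events matters, base R's `%OS` input format keeps the fraction. The sketch below is a minimal illustration on a single value; the `tz = "UTC"` argument is an assumption made here for reproducibility, whereas the code above uses the local time zone.

```r
raw <- "2016/01/01 10:51:15.304"

# "%S" parses whole seconds; the trailing ".304" is ignored.
whole <- as.POSIXct(raw, format = "%Y/%m/%d %H:%M:%S", tz = "UTC")

# "%OS" parses seconds including the fractional part.
with_fraction <- as.POSIXct(raw, format = "%Y/%m/%d %H:%M:%OS", tz = "UTC")

# Difference between the two parses, in seconds.
fraction <- as.numeric(with_fraction) - as.numeric(whole)
print(round(fraction, 3))
```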
str(data)
## 'data.frame': 400000 obs. of 25 variables:
## $ Case_ID : Factor w/ 57165 levels "Application_1000086665,A_Accepted,User_5,2016/08/05 15:57:07.419,2016/08/05 15:57:07.419,Variant 1,1,New credit"| __truncated__,..: 47006 47006 47006 47006 47006 47006 47006 47006 47006 47006 ...
## $ Activity : Factor w/ 26 levels "","A_Accepted",..: 6 10 5 23 2 14 15 18 21 4 ...
## $ Resource : Factor w/ 132 levels "","User_1","User_10",..: 2 2 2 52 91 91 91 91 91 91 ...
## $ Start_Timestamp : Factor w/ 363222 levels "","2016/01/01 10:51:15.304",..: 2 3 4 125 147 158 159 160 161 162 ...
## $ Complete_Timestamp : POSIXct, format: "2016-01-01 10:51:15" "2016-01-01 10:51:15" ...
## $ Variant : Factor w/ 2917 levels "","Variant 1",..: 1031 1031 1031 1031 1031 1031 1031 1031 1031 1031 ...
## $ Variant_index : int 2 2 2 2 2 2 2 2 2 2 ...
## $ X.case._ApplicationType: Factor w/ 3 levels "","Limit raise",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ X.case._creditGoal : Factor w/ 14 levels "","Boat","Business goal",..: 7 7 7 7 7 7 7 7 7 7 ...
## $ X.case._RequestedAmount: num 20000 20000 20000 20000 20000 20000 20000 20000 20000 20000 ...
## $ Accepted : Factor w/ 3 levels "","false","true": NA NA NA NA NA 3 NA NA NA NA ...
## $ Action : Factor w/ 5 levels "","Created","Deleted",..: 2 5 5 4 5 2 5 5 4 5 ...
## $ CreditScore : int NA NA NA NA NA 979 NA NA NA NA ...
## $ EventID : Factor w/ 363228 levels "","Application_1000158214",..: 16706 60870 129762 318353 154191 161638 208779 235898 351445 150834 ...
## $ EventOrigin : Factor w/ 4 levels "","Application",..: 2 2 2 4 2 3 3 3 4 2 ...
## $ FirstWithdrawalAmount : num NA NA NA NA NA 20000 NA NA NA NA ...
## $ MonthlyCost : num NA NA NA NA NA ...
## $ NumberOfTerms : int NA NA NA NA NA 44 NA NA NA NA ...
## $ OfferID : Factor w/ 28025 levels "","Offer_1000096910",..: NA NA NA NA NA NA 7178 7178 NA NA ...
## $ OfferedAmount : num NA NA NA NA NA 20000 NA NA NA NA ...
## $ Selected : Factor w/ 3 levels "","false","true": NA NA NA NA NA 3 NA NA NA NA ...
## $ lifecycle.transition : Factor w/ 3 levels "","complete",..: 2 2 2 3 2 2 2 2 3 2 ...
## $ starttimestamp : Factor w/ 234909 levels "","2016-01-01T09:51:15Z",..: 2 2 3 84 99 108 109 110 110 110 ...
## $ endtimestamp : Factor w/ 220855 levels "","2016-01-01T09:51:15Z",..: 2 2 3 80 92 101 102 103 103 103 ...
## $ Activity_Instance_ID : int 1 2 3 4 5 6 7 8 9 10 ...
head(data)
## Case_ID Activity Resource
## 1 Application_652823628 A_Create Application User_1
## 2 Application_652823628 A_Submitted User_1
## 3 Application_652823628 A_Concept User_1
## 4 Application_652823628 W_Complete application User_17
## 5 Application_652823628 A_Accepted User_52
## 6 Application_652823628 O_Create Offer User_52
## Start_Timestamp Complete_Timestamp Variant Variant_index
## 1 2016/01/01 10:51:15.304 2016-01-01 10:51:15 Variant 2 2
## 2 2016/01/01 10:51:15.352 2016-01-01 10:51:15 Variant 2 2
## 3 2016/01/01 10:52:36.413 2016-01-01 10:52:36 Variant 2 2
## 4 2016/01/02 11:45:22.429 2016-01-02 11:45:22 Variant 2 2
## 5 2016/01/02 12:23:04.299 2016-01-02 12:23:04 Variant 2 2
## 6 2016/01/02 12:29:03.994 2016-01-02 12:29:03 Variant 2 2
## X.case._ApplicationType X.case._creditGoal X.case._RequestedAmount
## 1 New credit Existing credit takeover 20000
## 2 New credit Existing credit takeover 20000
## 3 New credit Existing credit takeover 20000
## 4 New credit Existing credit takeover 20000
## 5 New credit Existing credit takeover 20000
## 6 New credit Existing credit takeover 20000
## Accepted Action CreditScore EventID EventOrigin
## 1 <NA> Created NA Application_652823628 Application
## 2 <NA> statechange NA ApplState_1582051990 Application
## 3 <NA> statechange NA ApplState_642383566 Application
## 4 <NA> Obtained NA Workitem_1875340971 Workflow
## 5 <NA> statechange NA ApplState_99568828 Application
## 6 true Created 979 Offer_148581083 Offer
## FirstWithdrawalAmount MonthlyCost NumberOfTerms OfferID OfferedAmount
## 1 NA NA NA <NA> NA
## 2 NA NA NA <NA> NA
## 3 NA NA NA <NA> NA
## 4 NA NA NA <NA> NA
## 5 NA NA NA <NA> NA
## 6 20000 498.29 44 <NA> 20000
## Selected lifecycle.transition starttimestamp endtimestamp
## 1 <NA> complete 2016-01-01T09:51:15Z 2016-01-01T09:51:15Z
## 2 <NA> complete 2016-01-01T09:51:15Z 2016-01-01T09:51:15Z
## 3 <NA> complete 2016-01-01T09:52:36Z 2016-01-01T09:52:36Z
## 4 <NA> start 2016-01-02T10:45:22Z 2016-01-02T10:45:22Z
## 5 <NA> complete 2016-01-02T11:23:04Z 2016-01-02T11:23:04Z
## 6 true complete 2016-01-02T11:29:03Z 2016-01-02T11:29:03Z
## Activity_Instance_ID
## 1 1
## 2 2
## 3 3
## 4 4
## 5 5
## 6 6
eventlog = data %>% #a data.frame with the information in the table above
eventlog(
case_id = "Case_ID",
activity_id = "Activity",
activity_instance_id = "Activity_Instance_ID",
lifecycle_id = "lifecycle.transition",
timestamp = "Complete_Timestamp",
resource_id = "Resource"
)
## Warning: package 'bindrcpp' was built under R version 3.4.4
eventlog %>% summary
## Number of events: 400000
## Number of cases: 57165
## Number of traces: 4802
## Number of distinct activities: 26
## Average trace length: 6.997289
##
## Start eventlog: NA
## End eventlog: NA
## Case_ID Activity Resource
## Length:400000 : 36773 User_1 : 49376
## Class :character O_Create Offer : 28024 : 36773
## Mode :character O_Created : 28024 User_49: 7778
## O_Sent (mail and online): 25938 User_29: 6932
## W_Validate application : 25480 User_3 : 6805
## A_Validating : 25013 User_10: 6694
## (Other) :230748 (Other):285642
## Start_Timestamp Complete_Timestamp
## : 36773 Min. :2016-01-01 10:51:15
## 2016/01/08 19:56:43.212: 2 1st Qu.:2016-03-19 17:33:35
## 2016/01/29 09:10:58.778: 2 Median :2016-06-07 10:39:49
## 2016/03/02 15:15:40.745: 2 Mean :2016-05-28 17:17:46
## 2016/07/11 15:17:14.450: 2 3rd Qu.:2016-08-03 03:16:04
## 2016/07/15 13:09:09.433: 2 Max. :2017-01-26 10:11:10
## (Other) :363217 NA's :36773
## Variant Variant_index X.case._ApplicationType
## : 36773 Min. : 1.0 : 36773
## Variant 1: 27588 1st Qu.: 7.0 Limit raise: 36394
## Variant 2: 19570 Median : 25.0 New credit :326833
## Variant 3: 14685 Mean : 390.3
## Variant 5: 10878 3rd Qu.: 257.0
## Variant 8: 10008 Max. :3159.0
## (Other) :280498 NA's :36773
## X.case._creditGoal X.case._RequestedAmount Accepted
## Car :116956 Min. : 0 : 36773
## Home improvement : 95296 1st Qu.: 6000 false: 8359
## Existing credit takeover: 71011 Median : 12000 true : 19665
## : 36773 Mean : 15618 NA's :335203
## Unknown : 32069 3rd Qu.: 20000
## Not speficied : 14204 Max. :450000
## (Other) : 33691 NA's :36773
## Action CreditScore EventID
## : 36773 Min. : 0.0 : 36773
## Created : 48416 1st Qu.: 0.0 Application_1000158214: 1
## Deleted : 27791 Median : 0.0 Application_1000311556: 1
## Obtained : 54569 Mean : 319.9 Application_1000339879: 1
## statechange:232451 3rd Qu.: 851.0 Application_100034150 : 1
## Max. :1142.0 Application_1000557783: 1
## NA's :371976 (Other) :363222
## EventOrigin FirstWithdrawalAmount MonthlyCost
## : 36773 Min. : 0 Min. : 43.0
## Application:154460 1st Qu.: 0 1st Qu.: 150.0
## Offer :126407 Median : 5000 Median : 232.8
## Workflow : 82360 Mean : 7681 Mean : 273.9
## 3rd Qu.:10996 3rd Qu.: 340.8
## Max. :75000 Max. :6673.8
## NA's :371976 NA's :371976
## NumberOfTerms OfferID OfferedAmount
## Min. : 5 : 36773 Min. : 5000
## 1st Qu.: 56 Offer_1000226917: 4 1st Qu.: 8000
## Median : 74 Offer_1000329580: 4 Median :15000
## Mean : 82 Offer_1000373613: 4 Mean :17820
## 3rd Qu.:120 Offer_1000572979: 4 3rd Qu.:24000
## Max. :180 (Other) : 98367 Max. :75000
## NA's :371976 NA's :264844 NA's :371976
## Selected lifecycle.transition starttimestamp
## : 36773 : 36773 : 36773
## false: 13915 complete:308658 2016-03-28T12:15:55Z: 10
## true : 14109 start : 54569 2016-02-16T16:25:24Z: 9
## NA's :335203 2016-03-03T07:01:13Z: 9
## 2016-03-09T08:44:36Z: 9
## 2016-04-28T06:00:26Z: 9
## (Other) :363181
## endtimestamp Activity_Instance_ID .order
## : 36773 Length:400000 Min. :1e+00
## 2016-03-28T12:15:55Z: 11 Class :character 1st Qu.:1e+05
## 2016-03-09T08:44:36Z: 10 Mode :character Median :2e+05
## 2016-07-22T14:19:10Z: 10 Mean :2e+05
## 2016-02-15T14:35:51Z: 9 3rd Qu.:3e+05
## 2016-02-16T16:25:24Z: 9 Max. :4e+05
## (Other) :363178
events <- bupaR::activities_to_eventlog(
data_act,
case_id = 'Case_ID',
activity_id = 'Activity',
resource_id = 'Resource',
timestamps = c('starttimestamp', 'endtimestamp')
)
events %>% summary
## Number of events: 800000
## Number of cases: 57165
## Number of traces: 2917
## Number of distinct activities: 26
## Average trace length: 13.99458
##
## Start eventlog: NA
## End eventlog: NA
## Case_ID Activity Resource
## Length:800000 O_Create Offer : 56048 User_1 : 98752
## Class :character O_Created : 56048 User_49: 15556
## Mode :character O_Sent (mail and online): 51876 User_29: 13864
## W_Validate application : 50960 User_3 : 13610
## A_Validating : 50026 User_10: 13388
## (Other) :461496 (Other):571284
## NA's : 73546 NA's : 73546
## Start_Timestamp Complete_Timestamp Variant Variant_index
## Length:800000 Length:800000 Length:800000 Min. : 1.0
## Class :character Class :character Class :character 1st Qu.: 7.0
## Mode :character Mode :character Mode :character Median : 25.0
## Mean : 390.3
## 3rd Qu.: 257.0
## Max. :3159.0
## NA's :73546
## (case)_ApplicationType (case)_creditGoal (case)_RequestedAmount
## Length:800000 Length:800000 Min. : 0
## Class :character Class :character 1st Qu.: 6000
## Mode :character Mode :character Median : 12000
## Mean : 15618
## 3rd Qu.: 20000
## Max. :450000
## NA's :73546
## Accepted Action CreditScore EventID
## Mode :logical Length:800000 Min. : 0.0 Length:800000
## FALSE:16718 Class :character 1st Qu.: 0.0 Class :character
## TRUE :39330 Mode :character Median : 0.0 Mode :character
## NA's :743952 Mean : 319.9
## 3rd Qu.: 851.0
## Max. :1142.0
## NA's :743952
## EventOrigin FirstWithdrawalAmount MonthlyCost
## Length:800000 Min. : 0 Min. : 43.0
## Class :character 1st Qu.: 0 1st Qu.: 150.0
## Mode :character Median : 5000 Median : 232.8
## Mean : 7681 Mean : 273.9
## 3rd Qu.:10996 3rd Qu.: 340.8
## Max. :75000 Max. :6673.8
## NA's :743952 NA's :743952
## NumberOfTerms OfferID OfferedAmount Selected
## Min. : 5 Length:800000 Min. : 5000 Mode :logical
## 1st Qu.: 56 Class :character 1st Qu.: 8000 FALSE:27830
## Median : 74 Mode :character Median :15000 TRUE :28218
## Mean : 82 Mean :17820 NA's :743952
## 3rd Qu.:120 3rd Qu.:24000
## Max. :180 Max. :75000
## NA's :743952 NA's :743952
## lifecycle:transition activity_instance_id lifecycle_id
## Length:800000 Length:800000 endtimestamp :400000
## Class :character Class :character starttimestamp:400000
## Mode :character Mode :character
##
##
##
##
## timestamp .order
## Min. :2016-01-01 10:51:15 Min. :1e+00
## 1st Qu.:2016-03-19 16:11:02 1st Qu.:2e+05
## Median :2016-06-07 10:26:21 Median :4e+05
## Mean :2016-05-28 15:39:43 Mean :4e+05
## 3rd Qu.:2016-08-02 20:05:06 3rd Qu.:6e+05
## Max. :2017-01-26 10:11:10 Max. :8e+05
## NA's :73546
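The summary above shows 800000 events for 400000 input rows, with `lifecycle_id` taking the values `starttimestamp` and `endtimestamp`. This is consistent with each row (one activity instance with two timestamp columns) being turned into two events. The following base R sketch illustrates that reshaping idea on hypothetical toy data; it is not the bupaR implementation, and the helper `to_events()` is a name invented for this example.

```r
# Toy activity log: one row per activity instance, two timestamp columns.
activities <- data.frame(
  Case_ID = c("A", "A"),
  Activity = c("Create", "Submit"),
  starttimestamp = as.POSIXct(c("2016-01-01 10:00:00", "2016-01-01 10:05:00"), tz = "UTC"),
  endtimestamp = as.POSIXct(c("2016-01-01 10:02:00", "2016-01-01 10:09:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
activities$activity_instance_id <- seq_len(nrow(activities))

# Hypothetical helper: build one event per row from the given timestamp column.
to_events <- function(df, column) {
  data.frame(
    Case_ID = df$Case_ID,
    Activity = df$Activity,
    activity_instance_id = df$activity_instance_id,
    lifecycle_id = column,
    timestamp = df[[column]],
    stringsAsFactors = FALSE
  )
}

# Stack start and end events, then sort chronologically.
events_long <- rbind(
  to_events(activities, "starttimestamp"),
  to_events(activities, "endtimestamp")
)
events_long <- events_long[order(events_long$timestamp), ]
print(events_long)
```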
freq=activity_frequency(events,level = "activity")
freq2=activity_frequency(eventlog,level = "activity")
#?activity_frequency
plot(freq)
plot(freq2)
eventlog %>%
filter_activity_presence(activities = c('A_Validating')) %>%
activity_frequency(level = "activity")
## # A tibble: 25 x 3
## Activity absolute relative
## <fct> <int> <dbl>
## 1 W_Validate application 25480 0.0904
## 2 A_Validating 25013 0.0887
## 3 O_Create Offer 19781 0.0701
## 4 O_Created 19781 0.0701
## 5 O_Sent (mail and online) 18188 0.0645
## 6 O_Returned 15130 0.0537
## 7 W_Call incomplete files 14457 0.0513
## 8 A_Incomplete 14335 0.0508
## 9 W_Call after offers 14228 0.0505
## 10 A_Accepted 14178 0.0503
## # ... with 15 more rows
events %>%
filter_activity_presence(activities = c('A_Validating')) %>%
activity_frequency(level = "activity")
## # A tibble: 25 x 3
## Activity absolute relative
## <fct> <int> <dbl>
## 1 W_Validate application 25480 0.0904
## 2 A_Validating 25013 0.0887
## 3 O_Create Offer 19781 0.0701
## 4 O_Created 19781 0.0701
## 5 O_Sent (mail and online) 18188 0.0645
## 6 O_Returned 15130 0.0537
## 7 W_Call incomplete files 14457 0.0513
## 8 A_Incomplete 14335 0.0508
## 9 W_Call after offers 14228 0.0505
## 10 A_Accepted 14178 0.0503
## # ... with 15 more rows
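The filtered tables above still list activities other than A_Validating, which suggests that the presence filter keeps whole cases in which the activity occurs rather than only the matching events. A minimal base R sketch of that idea, on a hypothetical toy log with made-up case identifiers:

```r
# Toy log: only cases c1 and c3 contain the activity of interest.
toy_log <- data.frame(
  case = c("c1", "c1", "c2", "c2", "c3"),
  activity = c("A_Create", "A_Validating", "A_Create", "A_Cancelled", "A_Validating"),
  stringsAsFactors = FALSE
)

# Cases in which the activity occurs at least once.
cases_with_validation <- unique(toy_log$case[toy_log$activity == "A_Validating"])

# Keep every event of those cases, not only the matching events.
kept <- toy_log[toy_log$case %in% cases_with_validation, ]
print(table(kept$activity))
```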
events %>%
filter_activity_frequency(percentage = 1.0) %>% # percentage = 1.0 keeps all activities; lower it to keep only the most frequent ones
filter_trace_frequency(percentage = .80) %>% # keep the most frequent traces, covering 80% of cases
process_map(render = TRUE)
plot(precedence_matrix(events,type = "absolute"))
plot(precedence_matrix(eventlog,type="absolute"))
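A precedence matrix counts how often one activity is directly followed by another within the same case. The sketch below computes such counts in base R for a hypothetical toy log, assuming that events are ordered by a numeric `time` column. It counts only transitions between observed activities and is meant as an illustration of the concept, not as a reproduction of the plots above.

```r
# Toy log with two cases of three events each.
toy_log <- data.frame(
  case = c("c1", "c1", "c1", "c2", "c2", "c2"),
  activity = c("A_Create", "A_Submitted", "A_Accepted",
               "A_Create", "A_Accepted", "A_Submitted"),
  time = c(1, 2, 3, 1, 2, 3),
  stringsAsFactors = FALSE
)

# Order events by case, then by time within the case.
toy_log <- toy_log[order(toy_log$case, toy_log$time), ]
n <- nrow(toy_log)

# Pair each event with the next one and keep pairs inside the same case.
same_case <- toy_log$case[-1] == toy_log$case[-n]
antecedent <- toy_log$activity[-n][same_case]
consequent <- toy_log$activity[-1][same_case]

# Cross-tabulate: rows are the preceding activity, columns the following one.
precedence <- table(antecedent, consequent)
print(precedence)
```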
trace_explorer(events,coverage = 0.8)
## Warning: Removed 1 rows containing missing values (geom_text).
trace_explorer(eventlog,coverage = 0.8)
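A trace is the ordered sequence of activities of one case, and the coverage argument above limits the display to the most frequent traces. The following base R sketch shows how trace frequencies and their cumulative share of cases can be derived, using a hypothetical toy log whose rows are assumed to be already in chronological order within each case.

```r
# Toy log: three cases follow "A > B", one follows "A > C".
toy_log <- data.frame(
  case = c("c1", "c1", "c2", "c2", "c3", "c3", "c4", "c4"),
  activity = c("A", "B", "A", "B", "A", "C", "A", "B"),
  stringsAsFactors = FALSE
)

# Collapse the activities of each case into one trace string.
traces <- vapply(split(toy_log$activity, toy_log$case),
                 paste, character(1), collapse = " > ")

# Frequency of each distinct trace, most frequent first.
trace_freq <- sort(table(traces), decreasing = TRUE)

# Cumulative share of cases covered by the top traces.
coverage <- cumsum(as.vector(trace_freq)) / length(traces)
names(coverage) <- names(trace_freq)
print(coverage)
```

With a coverage threshold of 0.8, both toy traces would be needed here, because the most frequent one alone covers only 75% of the cases.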
dotted_chart(eventlog)
## Joining, by = "Case_ID"
## Warning: Removed 36773 rows containing missing values (geom_point).
gb1=eventlog %>%
group_by(`X.case._ApplicationType`) %>%
throughput_time('log', units = 'hours')
gb1
## # A tibble: 3 x 10
## X.case._Applica~ min q1 median mean q3 max st_dev iqr
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 New credit 0.0781 286. 492. 545. 762. 4034. 323. 476.
## 2 "" NA NA NA NaN NA NA 36773 NA
## 3 Limit raise 0.0597 223. 336. 416. 549. 3038. 270. 326.
## # ... with 1 more variable: NA. <dbl>
plot(gb1)
## Warning: Removed 36773 rows containing non-finite values (stat_boxplot).
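Throughput time is read here as the time between the first and the last event of a case, which is an assumption about how the metric above is defined. Under that assumption, the grouped figures can be sketched in base R as follows, using hypothetical toy data with made-up case identifiers.

```r
# Toy log: two "New credit" cases and one "Limit raise" case.
toy <- data.frame(
  case = c("c1", "c1", "c2", "c2", "c3", "c3"),
  type = c("New credit", "New credit", "New credit", "New credit",
           "Limit raise", "Limit raise"),
  time = as.POSIXct(c("2016-01-01 10:00:00", "2016-01-02 10:00:00",
                      "2016-01-01 08:00:00", "2016-01-01 20:00:00",
                      "2016-01-03 09:00:00", "2016-01-03 15:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Throughput time per case in hours: last event minus first event.
secs_by_case <- split(as.numeric(toy$time), toy$case)
hours <- vapply(secs_by_case, function(s) (max(s) - min(s)) / 3600, numeric(1))

# Application type of each case (constant within a case in this toy data).
type_by_case <- vapply(split(toy$type, toy$case), function(x) x[1], character(1))

# Mean throughput time per application type.
mean_by_type <- tapply(hours, type_by_case[names(hours)], mean)
print(mean_by_type)
```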
gb2=eventlog %>%
group_by(`X.case._creditGoal`) %>%
throughput_time('log', units = 'hours')
gb2
## # A tibble: 14 x 10
## X.case._creditG~ min q1 median mean q3 max st_dev iqr
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Existing credit~ 0.326 318. 524. 576. 765. 3218. 322. 447.
## 2 Home improvement 0.174 305. 481. 543. 760. 2546. 300. 455.
## 3 Car 0.144 245. 411. 504. 758. 4034. 310. 513.
## 4 "" NA NA NA NaN NA NA 36773 NA
## 5 Remaining debt ~ 0.135 380. 732. 724. 884. 3501. 486. 504.
## 6 Not speficied 0.103 326. 619. 583. 782. 2547. 329. 456.
## 7 Unknown 0.0597 216. 358. 441. 732. 3191. 302. 516.
## 8 Caravan / Camper 0.174 226. 333. 445. 730. 2110. 304. 504.
## 9 Tax payments 51.4 296. 425. 545. 757. 2177. 365. 460.
## 10 Extra spending ~ 17.8 266. 440. 506. 755. 1815. 280. 488.
## 11 Motorcycle 46.9 259. 391. 480. 759. 1338. 273. 500.
## 12 Boat 55.0 267. 428. 514. 747. 1536. 291. 480.
## 13 Business goal 134. 265. 591. 566. 758. 1245. 315. 494.
## 14 Debt restructur~ 732. 732. 732. 732. 732. 732. NA 0
## # ... with 1 more variable: NA. <dbl>
plot(gb2)
## Warning: Removed 36773 rows containing non-finite values (stat_boxplot).
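We can see that, among the three most frequent credit goals (Car, Home improvement and Existing credit takeover), “Existing credit takeover” has the highest throughput time, with a mean of 576 hours. It is not the highest overall: the goals listed as “Remaining debt ~” and “Debt restructur~” have means of 724 and 732 hours, although the latter appears to rest on a single case (its minimum, median and maximum coincide and its standard deviation is NA).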